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Abstract 

Background: Digitalis purpurea is an important ornamental and medicinal plant. There is considerable interest in 
exploring its transcriptome. 

Results: Through high-throughput 454 sequencing and subsequent assembly, we obtained 23532 genes, of which 
15626 encode conserved proteins. We determined 140 unigenes to be candidates involved in cardiac glycoside 
biosynthesis. It could be grouped into 30 families, of which 29 were identified for the first time in D. purpurea. We 
identified 2660 mRNA-like npcRNA (mlncRNA) candidates, an emerging class of regulators, using a computational 
mlncRNA identification pipeline and 13 microRNA-producing unigenes based on sequence conservation and 
hairpin structure-forming capability. Twenty five protein-coding unigenes were predicted to be targets of these 
microRNAs. Among the mlncRNA candidates, only 320 could be grouped into 140 families with at least two 
members in a family. The majority of D. purpurea mlncRNAs were species-specific and many of them showed 
tissue-specific expression and responded to cold and dehydration stresses. We identified 417 protein-coding genes 
with regions significantly homologous or complementary to 375 mlncRNAs. It includes five genes involved in 
secondary metabolism. A positive correlation was found in gene expression between protein-coding genes and 
the homologous mlncRNAs in response to cold and dehydration stresses, while the correlation was negative when 
protein-coding genes and mlncRNAs were complementary to each other. 

Conclusions: Through comprehensive transcriptome analysis, we not only identified 29 novel gene families 
potentially involved in the biosynthesis of cardiac glycosides but also characterized a large number of mlncRNAs. 
Our results suggest the importance of mlncRNAs in secondary metabolism and stress response in D. purpurea. 



Background 

Digitalis purpurea L. (common foxglove, purple foxglove 
or lady's glove) is an herbaceous biennial flowering plant 
species. It is native to Europe and has been widely intro- 
duced into other parts of the world. Now, this plant 
becomes naturalized in many countries, such as China, 
Canada, USA, and New Zealand [1]. D. purpurea is 
known for its campanulate flowers produced on a tall 
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spike. The flower color varies from purple to pink, yel- 
low or white. Because of the showy flowers, D. purpurea 
has an ornamental value and is cultivated world widely. 
However, this plant is highly poisonous to humans and 
may be fatal if ingested, so it is suggested to be banished 
from the table [2,3]. Although D. purpurea is toxic, it 
has a high medical value in the therapy of congestive 
heart failure, particularly arrhythmia for over 200 years 
[1]. Many patients benefited from the use of this plant 
and its extracts. The main active compounds of D. pur- 
purea are cardiac glycosides (cardenolides), which are a 
group of cardio-active agents with the ability to inhibit 
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Na + /K + -ATPase and have been proved to be the most 
effective drugs to treat heart failure during the past two 
centuries [1,4,5]. D. purpurea is also a potential source 
of anti-tumor drugs because of its high cytotoxicity 
against human cancer cell lines [6]. The discovery of 
cardio-active agents in D. purpurea by the 18th century 
Scottish physician William Withering is thought to help 
launching the development of modern pharmacology 
and the pharmaceutical industry, further suggesting the 
importance of D. purpurea to medicine [7]. 

D. purpurea tolerates many environmental stresses, such 
as cold, drought, heat, flooding, full sun and poor soil, 
although it prefers rich and well-drained acidic soils and 
half shade and moist environments. This plant can survive 
in most temperate climates of the world, including the 
cold temperate in Alaska of the United States and south- 
western Norway and the hot and drought conditions in 
some areas of Africa [1,8]. Being a biennial plant, D. pur- 
purea is able to withstand the winter cold and the summer 
drought [9]. Significant damage to D. purpurea leaves 
were happened only at -12°C or below and the threshold 
for buds and roots was as low as -15°C and -18°C, respec- 
tively [9] . Thus, D. purpurea seems to be an ideal species 
for studying plant responses to drought and cold stresses. 

Although D. purpurea exhibits significance in orna- 
ment, medicine and biological study, little has been done 
about genes involved in the growth and development of 
this plant species. No EST sequences are available and 
only 77 nucleotide records are found for D. purpurea in 
the NCBI nucleotide database http://www.ncbi.nlm.nih. 
gov/nucleotide/. The major efforts to study the molecular 
aspects of D. purpurea were focused on the purification 
and characterization of malonyl-coenzyme A: 21-hydro- 
xypregnane 21-O-malonyltransferase (Dp21MaT) and the 
identification of genes encoding progesterone 5p -reduc- 
tase (5P-POR) and A 5 -3p-hydroxysteroid dehydrogenase 
(3PHSD), which are involved in the biosynthesis of car- 
diac glycosides [5,10-12]. 

Non-protein-coding RNAs (npcRNAs), also known as 
noncoding RNAs (ncRNAs), are transcripts that have 
neither experimental nor evolutionary evidence for an 
open reading frame encoding a functional protein [13]. 
They are produced in both eukaryotes and prokaryotes 
and represent the vast majority of all transcripts in a cell 
as demonstrated by genomic tiling arrays, whole tran- 
scriptome analysis, reverse transcription-RCR (RT-PCR) 
and computational prediction [14-26]. In human, more 
than 90% of the total genome sequence is likely to be 
transcribed, but only about 1.5% of the genome codes for 
proteins, suggesting the abundance of human npcRNA 
species [15,27]. The high percentage of transcription in 
non-protein-coding regions was also reported for other 
eukaryotes, such as mouse, yeast and Drosophila melano- 
gaster [13,28-32]. Now, npcRNA is a research hotspot of 



RNA biology and the number of identified npcRNAs is 
increasing rapidly. 

Results from gene expression profiling and functional 
analysis strongly suggested that most of the npcRNAs 
could be biologically meaningful rather than merely tran- 
scriptional "noise" [13,32]. Based on the expression profiles 
and predicted functions, npcRNAs can be classified into 
two groups, including housekeeping npcRNAs and regula- 
tory npcRNAs [13]. Housekeeping npcRNAs are usually 
expressed constitutively and include rRNAs, tRNAs, 
snRNAs and snoRNAs, whereas regulatory npcRNAs are 
differentially expressed and developmentally regulated and 
can be subdivided into short regulatory npcRNAs (or 
small RNAs, sRNAs, < 40 nucleotides), such as micro- 
RNAs (miRNAs) and small interfering RNAs (siRNAs), 
and large regulatory npcRNAs (or long npcRNAs, 
IncRNAs, > 40 nucleotides) that may or may not be polya- 
denylated. A subset of IncRNAs, which can undergo spli- 
cing and are capped and polyadenylated as conventional 
mRNAs, are called mRNA-like npcRNAs (mlncRNAs) 
[33]. 

Recently, sRNAs, particularly miRNAs, have been 
intensely studied [34] . The number of identified miRNAs 
has reached to 16772 according to the most recent 
release of miRBase (release 17, http://www.mirbase.org/). 
It includes 3362 miRNAs from plants, 237 from viruses, 
and more than 13,000 from animals. In plants, miRNAs 
play crucial roles in organ development and defense 
responses by targeting other RNA molecules for cleavage, 
or, in a few cases, for translational repression [35-38]. 
IncRNAs are much less studied as compared with 
mRNAs and sRNAs and represent a major unexplored 
transcript species [13]. So far, only a small number of 
IncRNAs have been functionally characterized [32]. The 
known roles of IncRNAs in animals include transcrip- 
tional regulation, epigenetic gene regulation, and disease 
response [13]. Studies of plant IncRNAs were mainly 
concentrated on systematic searches of mlncRNAs in 
Arabidopsis thaliana, wheat, and Medicago truncatula 
[39-46]. Only few were identified in other plant species 
[47-53]. Several of the identified plant mlncRNAs, includ- 
ing Arabidopsis AtIPSl/At4 and COLDAIR, Cucumis 
sativus CsMlO, Zea mays Zm401 and M. truncatula 
enod40, were functionally characterized. The results sug- 
gested important regulatory roles of plant mlncRNAs in 
phosphate deprivation response, vernalization response, 
pollen development, gender-biased expression, and nodu- 
lation [47,49-57]. Functions of the other plant mlncRNAs 
are largely unknown. 

In this study, we performed a comprehensive analysis 
of D. purpurea transcriptome that had never been 
explored. We obtained a high-quality unigene set and 
determined conserved and non-conserved protein-coding 
genes, including candidates for about 75% of the total 
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cardiac glycoside biosynthesis-associated gene families in 
D. purpurea. We identified a large number of mlncRNA 
candidates and 13 microRNA-producing unigenes and 
predicted 25 microRNA targets. Using a comprehensive 
approach, we revealed some important characteristics of 
mlncRNAs and took efforts to identify mlncRNA-regu- 
lated protein-coding genes. These results enhance our 
knowledge of cardiac glycoside biosynthesis and provide 
useful information for understanding the function of 
mlncRNAs. 

Results 

High-throughput sequencing and assembly of D. 
purpurea transcriptome 

In order to explore the transcriptome of D. purpurea, we 
constructed a cDNA library of the leaves of one-year-old 
plants and sequenced it using the 454 ultra-high-through- 
put pyrosequencing platform, the Genome Sequencer FLX 
System. We obtained a total of 66103 high-quality reads 
[GenBank SRA database series identifier SRX084241] with 
an average length of 402 bases. Pre-processing and de 
novo assembly using the GS De Novo Assembler Software 
clustered 51056 reads (77.2% of the total high-quality 
reads) into 8413 contigs [GenBank TSA database acces- 
sion nos. JO460003-JO467642] and left 15119 singletons, 
yielding a total of 23532 unigenes. Original reads of 



contigs ranged from 2 to 268 with an average 6, and 
lengths ranged from 94 bp to 3927 bp with an average 519 
bp. The size distribution showed that most of contigs ran- 
ged from 300 bp to 600 bp and the percentage of contigs 
no less than 300 bp was 76% (Figure 1). These data sug- 
gested that most of the pyrosequencing data had been suc- 
cessfully assembled into relatively long contigs. 

Functional annotation and gene ontology analysis 

Functional annotation of unigenes was carried out by 
sequence similarity searches against the NCBI non- 
redundant (Nr) protein database using the Basic Local 
Alignment Search Tool (BLAST) [58]. We applied a gen- 
erous e-value cutoff of 10~ 5 to the BLAST homolog 
recognition. 15626 (66.4%) of total unigenes were anno- 
tated according to the proposed function of the best 
match for each unigene in the Nr protein database. 
These annotated unigenes were protein-coding genes 
conserved in other species. The remaining 7906 unigenes 
(33.6%), which did not show significant similarity to any 
sequences in the Nr protein database, could be npcRNAs, 
UTR regions of known protein-coding genes, or novel 
protein-coding transcripts. 

The D. purpurea unigenes were further analyzed and 
categorized using the Gene Ontology (GO), which had 
been widely used to standardize the representation of gene 
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Figure 1 Size distribution of contigs assembled from 454 pyrosequencing data. The number of contigs with same length (A) and the 
number of contigs with length less than 300 bp, 300-600 bp and larger than 600 bp (B) are shown. 
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and gene product attributes across species and to describe 
gene products in terms of their associated biological pro- 
cesses, cellular locations and molecular functions in a spe- 
cies-independent manner [59]. GO analysis annotated a 
total of 6462 unigenes and assigned one or more GO 
terms to each of them (Figure 2). The category of molecu- 
lar function, which included 5750 unigenes, was the lar- 
gest, followed by the biological process category (4607 
unigenes) and the cellular location category (2144 uni- 
genes). GO assignments of 29 subcategories of the three 
categories were shown in Figure 2. Metabolic process, pro- 
tein binding, cellular process and catalytic activity were the 
largest four subcategories. 

Identification of protein-coding genes involved in cardiac 
glycoside biosynthesis 

D. purpurea is a useful medicinal plant species in heart 
failure treatment [1]. Isolation and characterization of 
genes involved in the biosynthesis of secondary metabo- 
lites, such as cardiac glycosides, is particularly valuable. 
However, little research had been done on this 
area. With the aim of identifying cardiac glycoside 



biosynthesis-associated genes on a transcriptome-wide 
scale, we first analyzed the 23532 D. purpurea unigenes 
by sequence similarity searches against the Kyoto Ency- 
clopedia of Genes and Genomes database (KEGG) and 
then searched for unigenes associated with metabolic 
pathways [60]. Using an E-value cutoff of 10" 5 , we com- 
putationally annotated 12921 unigenes (54.9%), of which 
1400 were assigned to the KEGG metabolic pathway 
category, suggesting a number of unigenes associated 
with metabolic pathways of D. purpurea were obtained. 
Within the metabolic pathway category, unigenes were 
separated into 10 subcategories (Figure 3A). 

Further analysis of unigene assignment in the subcate- 
gories showed that 797 of the 1400 unigenes each were 
classified to a subcategory, while the other 603 each were 
assigned to two or more subcategories. Among the 10 sub- 
categories of metabolic pathway, carbohydrate metabolism 
was assigned the largest amount of unigenes (Figure 3A), 
followed by amino acid metabolism, energy metabolism, 
lipid metabolism, biosynthesis of secondary metabolites, 
and the others (Figure 3A). The biosynthesis of steroids 
and terpenoids were well-represented by unigenes in the 
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Figure 2 Gene ontology analysis of D. purpurea unigenes. Percent of unigenes assigned to 29 subcategories of the cellular location, 
molecular function and biological process categories is shown. 
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□ Amino acid metabolism 

■ Biosynthesis of polyketides and nonribosomal peptides 

□ Biosynthesis of secondary metabolites 

□ Carbohydrate metabolism 

■ Energy metabolism 

□ Glycan biosynthesis and metabolism 

■ Lipid metabolism 

□ Metabolism of cofactors and vitamins 

■ Nucleotide metabolism 

■ Xenobiotics biodegradation and metabolism 

□ Androgen and estrogen metabolism 

■ Arachidonic acid metabolism 

□ Bile acid biosynthesis 

□ Biosynthesis of steroids 

■ Biosynthesis of unsaturated fattyacids 

□ Ether lipid metabolism 

■ Fatty acid biosynthesis 

□ Fatty acid elongation in mitochondria 

■ Fatty acid metabolism 

■ Glycerolipid metabolism 

□ Glycerophospholipid metabolism 

□ Linoleic acid metabolism 

■ Sphingolipid metabolism 

■ Synthesis and degradation of ketonebodies 




□ Alkaloid biosynthesis I 

■ Alkaloid biosynthesis II 

□ Brassinosteroid biosynthesis 

□ Caffeinemetabolism 

■ Carotenoid biosynthesis 

□ Diterpenoid biosynthesis 

■ Flavone andflavonol biosynthesis 

□ Limonene and pinene degradation 

■ Novobiocinbiosynthesis 

■ Phenylpropanoidbiosynthesis 

□ Streptomycinbiosynthesis 

□ Terpenoid biosynthesis 

■ Tetracycline biosynthesis 

■ Zeatinbio synthesis 



■ alpha-Linolenic acid metabolism 

Figure 3 Percentages of D. purpurea unigenes in 10 subcategories of the metabolic pathway category (A), 15 sub-sub-categories of 
the lipid metabolism subcategory (B), and 14 sub-sub-categories of the biosynthesis of secondary metabolites subcategory (C). 



subcategories of lipid metabolism and biosynthesis of 
secondary metabolites, allowing the effective identification 
of cardiac glycoside biosynthesis-related genes in 
D. purpurea (Figure 3B, C). 

Digitalis cardiac glycosides are a class of plant secondary 
metabolites consisting of a steroid nucleus and a sugar 
side chain varied in length [61]. The putative biosynthetic 
pathway of plant cardiac glycosides roughly comprises ter- 
penoid backbone biosynthesis, steroid biosynthesis and 
cardenolide biosynthesis three stages, involving about 40 
gene families, of which only a few had been characterized 
in D. purpurea. Examination of unigenes assigned to the 
subcategories of lipid metabolism and biosynthesis of sec- 
ondary metabolites allowed us to identify a total of 140 
unigenes that could be associated with the biosynthesis of 
terpenoid backbone, steroids and cardenolides (see Addi- 
tional file 1). These unigenes were grouped into 30 gene 
families, representing about 75% of the total cardiac glyco- 
side biosynthesis-associated gene families. 

Among the 140 unigenes, 24 assembled from 103 origi- 
nal 454 reads were involved in the biosynthesis of 



terpenoid backbone, and represented 13 of the 16 terpe- 
noid backbone biosynthesis-associated gene families (see 
Additional file 1). Twenty unigenes assembled from 118 
original 454 reads were identified to represent 12 of the 13 
putative steroid biosynthesis-associated gene families, sug- 
gesting genes involved in the steroid biosynthesis were 
also well-represented in the unigene set (see Additional 
file 1). 

Different from that involved in the biosynthesis of ter- 
penoid backbone and steroids, genes associated with car- 
denolide biosynthesis were not well-represented. The 
putative pathway of cardenolide biosynthesis involved at 
least 11 gene families, of which only 5 were represented 
by unigenes. It includes those encoding A 5 -3|3-hydroxys- 
teroid dehydrogenases (3P-HSD), progesterone 5p -reduc- 
tases (5P-POR), mono-oxygenases, cardenolide 16'-0- 
glucohydrolases (CGH), and glycosyltransferases (GT) 
(see Additional file 1). 3/3-HSD and 5/3-POR and mono- 
oxygenase genes are involved in the initial steps of carde- 
nolide biosynthesis toward cardenolide genin formation, 
GT genes are responsible for addition of sugars to the 
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genins, whereas CGH is involved in the degradation of 
primary cardiac glycosides. The gene families of mono- 
oxygenases and GTs are usually super-large in plants and 
play crucial roles in various cellular and metabolic pro- 
cesses. Consistently, in the 454 EST database, mono- 
oxygenases were represented by 36 unigenes assembled 
from 100 original 454 reads, while GTs were represented 
by 56 unigenes from 133 reads. The exact mono-oxyge- 
nase and GT genes involved in cardenolide biosynthesis 
remained to elucidate. It is worthy to note that the unigene 
encoding CGH was listed among the 25 most abundant 
unigenes (see Additional file 2). Low representativeness of 
unigenes for cardenolide biosynthesis-associated gene 
families in our 454 EST database and high expression of 
the gene involved in primary cardiac glycoside hydrolysis 
were consistent with overall low cardiac glycoside content 
in D. purpurea leaves, although relative to other organs 
leaves have the highest content of cardiac glycosides [62]. 
These data greatly enhanced our knowledge of Digitalis 
cardiac glycoside biosynthesis and provided candidate 
genes for improving the production of active medicinal 
compounds in D. purpurea through genetic engineering. 

Transcriptome-wide identification of D. purpurea 
mlncRNA candidates 

Although npcRNA had been found in various organisms, 
systematically searches of plant mlncRNAs were carried 
out only in A. thaliana, wheat and M. truncatula [39-46]. 
In D. purpurea, among the 23532 unigenes analyzed, 7906 
(33.6%) could be novel protein-coding transcripts or 
npcRNAs because they did not show significant similarity 
to any sequences in the Nr protein database. Among the 
7906 unigenes, 4010, which were not further analyzed, 
were less than 300 bp in length. The other 3896 with sizes 
at least 300 bp were subjected to detailed characterization 
using the pipeline summarized in Figure 4. 

The pipeline we used is similar to those reported pre- 
viously for systematic screens of mlncRNAs from the 
EST databases oi Arabia 1 op sis [39,40,42], Medicago [45], 
Sus scrofa [63], Drosophila [64], and human [65]. A cutoff 
of 100 amino acids was applied in this study to distin- 
guish npcRNAs from protein-coding transcripts as sug- 
gested by Rymarquis et al (2008). The results showed 
that 1215 of the 3896 unigenes with sizes at least 300 bp 
had the potential to encode proteins, while the other 
2681 unigenes were npcRNA candidates. It includes 2660 
mlncRNAs and 21 housekeeping npcRNAs (a tRNA pre- 
cursor and 20 snoRNA precursors) (see Additional file 
3). Low number of non-polyadenylated tRNA and snoR- 
NAs are consistent with our experimental strategy to 
select polyadenylated RNAs for library construction. 
Being consistent with the size distribution of contigs, 
most of the identified mlncRNA candidates have sizes 
ranged between 300 bp and 600 bp (Figure 5). The others 
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Figure 4 Schematic outline of novel protein-coding genes and 
mlncRNA candidates. The number of unigenes is shown in 
parentheses. ORF, open reading frame. 



include 22 with 600-699 bp in length and 30 over 700 bp 
(Table 1). The long ones, such as those with sizes over 
600 bp, are very likely to be authentic mlncRNAs, 
whereas we cannot rule out the possibility that some of 
the short ones are UTRs of protein coding genes or con- 
tain partial ORFs and UTRs of protein coding genes. 
Further experimental cloning of full-length transcripts 
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Figure 5 Size distribution of mlncRNA candidates. 
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Table 1 D. purpurea mlncRNA candidates with sizes over 
700 bp 

npcRNA name Unigene ID Length (bp) number of 454 reads 
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may help verify those short mlncRNA candidates. These 
results suggest the existence of a large number of 
mlncRNAs in D. purpurea. 

Identification of 13 D. purpurea microRNAs 

microRNAs (miRNAs) are a class of small noncoding 
RNAs derived from long primary miRNAs and play crucial 
roles in organ development and defense responses [35-38]. 
Many miRNAs are deeply conserved across species 
boundaries. It allows us to identify D. purpurea miRNAs 
by computational search of our unigene set based on the 
conservation of miRNA sequences and the secondary 
structure of unigenes [66]. We, therefore, searched the 
unigene set for sequences similar to the miRNAs included 
in the most recent release of miRBase (release 17, http:// 
www.mirbase.org/) [67] using the BLASTN program [58], 
and then performed mfold analysis of hairpin structures 
for those unigenes with regions no more than three mis- 
matches to known miRNAs [68]. As a result, we identified 



a total of 13 miRNA-producing unigenes, which counted 
for 0.06% of total D. purpurea unigenes (Table 2). Except 
FXAT9O005F5R10, which is 209 bp in length, the other 
12 miRNA-producing unigenes are mlncRNAs with sizes 
at least 300 bp. It includes JO460086 (mlncR9), one of the 
mlncRNAs with sizes over 700 bp (Table 1). Based on the 
predicted miRNA sequences, these unigenes were classi- 
fied into eight families, of which four, including MIR 160, 
MIR396, MIR397 and MIR408, were each represented by a 
single member, while the others, including MIR1S6, 
MIR 166, MIR 167 and MIR 172, each had two or three 
members identified (Figure 6 and Table 2). In absence of a 
small RNA database for D. purpurea, we are not able to 
reveal those mlncRNAs that produce species-specific miR- 
NAs, but even so, the ratio (-0.06%) of miRNA-producing 
unigenes to total D. purpurea transcripts is much higher 
than the ratios (-0.01%) reported previously for other 
plant species, suggesting the importance of miRNAs in 
D. purpurea plants [66]. 

Identification of 25 D. purpurea miRNA targets 

The regulatory role of plant miRNAs was achieved 
mainly through targeting other RNA molecules for clea- 
vage [35-38]. Perfect or near-perfect complementarities 
were found between plant miRNAs and their targets, 
allowing an effective prediction of target sequences 
through computation [69]. A useful web server for pre- 
diction of plant miRNA targets, known as psRNATarget, 
has been developed [70]. It searches potential miRNA 
targets using an improved iterative parallel Smith- 
Waterman algorithm and a weighted scoring schema 
[71]. Penalty scores were calculated for mismatched pat- 
terns in the miRNA:mRNA duplexes within a 20-base 
sequence window. Using psRNATarget and setting the 
penalty score cutoff threshold 0-3, we identified a total 
of 25 targets from our unigene set for seven D. pur- 
purea miRNA families, which are MIR156, MIR160, 
MIR166, MIR167, MIR172, MIR396, and MIR408 (Table 
3). The number of targets for each miRNA families 
range between one and seven. No targets were predicted 
for MIR397, indicating a very low expression level of 
miR397 targets in D. purpurea leaves. 

As expected, some of the predicted targets are deeply 
conserved among various plant species. It includes two 
auxin response factor (ARF) genes regulated by miR160, 
an APETALA2 (AP2) gene targeted by miR172, and a 
growth-regulating factor (GRF) gene cleaved by miR396 
(Table 3). All of them encode transcription factors that are 
important to plant development, suggesting the conserved 
roles of miRNAs in different plant species. Twenty one of 
the 25 predicted targets appear to be not conserved targets 
of miRNAs. It is consistent with the previous results show- 
ing species-specific targets for many conserved miRNAs in 
other plant species, such as Populus trichocarpa and Pinus 
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Table 2 Identification of 13 D. purpurea miRNAs 



miRNA gene 


miRNA sequence 


Unigene ID 


MFEI* 


Length (bp) 


454 reads 


MIR156a 


UGACAGAAGAGAGGGAGCAC 


J0463945 


0.99 


494 


6 


MIR1 56b 


UGACAGAAGAGAGUGAGCAC 


FXAT9O005F5R10 


1 .1 1 


209 


1 


MIR160 


UGCCUGGCUCCUUGUAUGCCA 


FXAT9O005F2EIB 


1.18 


425 


1 


MIR166a 


UCGGACCAGGCUUCAUUCCUC 


J 0467 104 


0.82 


356 


5 


MIR166b 


UCGGACCAGGCUUCAUUCCCC 


J0462464 


0.74 


595 


6 


MIR167a 


UGAAGCUGCCAGCAUGAUCUA 


J0465767 


1.12 


379 


3 


MIR167b 


UGAAGCUGCCAGCAUGAUCUA 


FXAT9O005F011M 


1.07 


518 


1 


MIR167C 


UGAAGCUGCCAGCAUGAUCUG 


FXAT9O005F5PKB 


0.85 


300 


1 


MIR172a 


AGAAUCUUGAUGAUGCUGCAU 


J0462274 


1.25 


610 


4 


MIR172b 


AGAAUCUUGAUGAUGCUGCAU 


FXAT9O005GBAO8 


1.09 


399 


1 


MIR396 


UUCCACAGCUUUCUUGAACUG 


FXAT9O005FNTBE 


1.02 


398 


1 


MIR397 


CCAUUGAGUGCAGCGUUGAUG 


FXAT9O005F4N47 


1.33 


438 


1 


MIR408 


CUGCACUGCCUCUUCCCUGGC 


JO460086 


0.90 


1095 


45 



* MFEI, minimal folding free energy index of the hairpin structures. 



MIR156a (AG=-44.6) 

A A- AAU AUU 

5'- GGGAUU UGACAGAAG GAGGGAGCACAGGCAGGGA UGUAUAGAG A 

I I I I I I I I I I I I I I I I I I I I I I I I I III II I I I I I I I I I 
3'- CCUUAA ACUGUCUUC UUCCCUAGUGUGCGUAUUU GCAUAUUUU / 

A UCG CC GUG 



\1IR156b (AG= -43.4) 

A - GCAGGCUAAUU UGUU 

5' - GUGUGACAG AAGA GAGUGAGCACAC GUAUACG U 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

3' - CACACUGUC UUCU UUCACUCGUGUG CAUAUGU C 

A U G UUUC 



MIRI60 (AG= -52.9) 



C AAU 
5'- AUGCCUGGCUCCUUGUAUGCCAUUCAGCAAGUACAUC A 

I I I I I I I I I I I I I I I I I I I I I I I I I III I I I I 
3'- UACGGACCGAGGACCGUGCGGUAGGUGUUUCCGGUGG / 

A ACG 



MIR166a (AG= -44.2) 

U UU - AAC GCA- AAU 

5'- GUGUUUGUGGGGGAUG GG CUGGUGCGACGC CACGUUUUU GUU U 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I III 
3' - CCCAAACUCUCCUUAC UC GACCAGGCUACG GUGCAAAAA UAG C 

G U- G GUA AAGA AGU 



MIR166b(\G=-42.3) 

U GGC AGAUCUAUAUGUCAAAUCUUGAUCUAAU 

5'- UGAGGGGACUGUUGUCUGGCCUGA CACCAACA C 

II I I I I I II I I I I I I I I I I III I I I I 
3'- ACCCCCCUUACUUCGGACCAGGCU GUGAUUGU / 

A GCA GUACUAGAUUGACAUUGAUCUAAUGUUU 



MIR167a(AG= -43.6) 

UGU G AAA UU 

5' -UAGCC UC UGAAGCUGCCA CAUGAUCU CUUUCCUUU U 

I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
3' -GUUGG AG ACUUCGAUGGU GUACUAGA GAAAGGAGA U 
UCUU CCU - CAA UG 



MIR167b(AG= -55.4) 



U — - GGA 

5'- CUAGAAAGGAACAAGUGAAGCUGC CAGCAUGAUCUAUCUUGGGAUGGU UUU □ 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II III I I I I III 
3' - GGUUUCUCCUUCUUCACUACGACG GUCGUACUGGAUCGAUCCCAAUCA GAG A 

G AU U AGU 



MIR167c(AG= -55.3) 

U C ACA CAUC CAAAAUGAUUCCUACUUGCUAAUUAUGGAA 

5'- GAGAGGUUGAAGCUGCCAG AUGAUCUGACCAUU GACUU AUCUCCAUUCUUC A 

III I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I II I I I I I I I I I 

3' - CUCACUAACUUCGACGUUC UACUGGACUAGUAA UUGAA UAUUGGUAAGAAG U 

U - ACUUUGAACUACUAAUCUACUAGUUU AC CC 



MIR172a (AG= -47.6) 

A GAA- UA U 

5' -AGUCGUUU UUAUCUGGUGUGGUAUCAUCAAGAUUCACAU ACC GUUGU 

I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I III III I I I I 
3' -UCAGUAAA AAAAGGCUACGUCGUAGUAGUUCUAAGAGUA UGG UAAC / 

GUAG — U 



MIR172b(AG= -37.2) 

U AG-- AAU 

5' - CUGGCGUGGCAUCAUCAAGAUUCACAUGA CU U 

III I I I I I I I I I I I I I I I I I I I I I II 
3' - GGCUACGUCGUAGUAGUUCUAAGAGUACU GA C 

C AGUA AGU 



MIR396(\G=-48.0) 

AC — AGA GUA AG 

5' - UCUUCUUAU UUUUCCACAGCUUUCUUGAACUGC UCAUC UCGG U 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
3'- GGGAGAGUA AGAGGGUGUCGAAGUGACUUGGCG AGUAG GGUC A 

CU AU C-- A-- AA 



MIR397 (AG= -49.3) 

u uuc ucuuu 

5' - UAAAAACGCCAUUGAGUGCAGCGUUGAUGAUAUUGCAUUUG GAUC A 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I III III I I I I 
3' - AUUUUUGCAAUAAUUCACGUCGUAACUGCUAUAUUGUUAGC UUAG / 
A UAU UCCUG 



MIR408(AG= -46.0) 

AA G 

5' -GGAGAGACGGGGACGAGACAGAGCAUGGCAAGGG AAAGAAAAUG A 

I I I I I I I I I I I I III III III I I II III I I I I I I I I I 
3 ' -CCUCUCGGUCCCUUCUCCGUCACGUCCCUUUCCC UUUCGUUUAU C 

G- U 

Figure 6 Predicted hairpin structures of D. purpurea miRNA precursors. Mature miRNA sequences are indicated in red. The red, green and 

blue vertical lines indicate G:C, G:U and A:U pairing, respectively, 
k J 
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Table 3 Identification of miRIMA targets in D. purpurea 



miRNA family 


Function of targets 


Unigene ID of targets (penalty score) 


Length (bp) 


MIR156 


unknown 


FXAT9O005F20XA (0.5) 


369 






FXAT9O005GEKYB (3) 


296 




PRLHnteracting factor L 


FXAT9O005FO94E (3) 


427 


MIR160 


ARF 


JO460174 (1.5) 


932 






FXAT9O005F7ZVB (2.5) 


460 




glyoxalase I 


JO466078 (3) 


623 


MIR166 


unknown 


FXAT9O005F0WN0 (2.5) 


441 




VQ-motif containing protein 


FX AT9O005 FXWFT (3) 


506 


MIR167 


unknown 


FXAT9O005F7P6T (3) 


463 


MIR172 


AP2 


FXAT9O005FMQGW (1.5) 


376 




serine/threonine kinase 


FXAT9O005F9IN5 (2.5) 


464 




hydrolase 


FXAT9O005FV544 (3) 


195 






FXAT9O005F4DOL (3) 


489 




sec14 cytosolic factor 


JO460887 (3) 


1402 




unknown 


FXAT9O005F2EA1 (3) 


534 






JO465008 (3) 


479 


MIR408 


KH domain-containing protein 


FXAT9O005F4K0N (3) 


319 






JO464508(3) 


1292 


MIR396 


GRF 


FXAT9O005FZB6N (2) 


445 




global transcription factor 


FXAT9O005FZLA6 (3) 


453 




dead box ATP-dependent RNA 


J0461297 (3) 


703 




helicase 


J0464891 (3) 


398 




Heat shock protein 


FXAT9O005F8VKM (2.5) 


336 




unknown 


FXAT9O005F0GS0 (3) 


438 






FXAT9O005FSKAG (3) 


373 



taeda [23,72,73]. Among the twenty one targets, twelve are 
involved in metabolism, RNA process, transcriptional reg- 
ulation and signal transduction, while the other nine are 
function- unknown, indicating possible new roles of miR- 
NAs in D. purpurea (Table 3). Computational comparison 
of gene functions showed that many of these species-spe- 
cific targets could be important in plant responses to bio- 
tic and abiotic stresses. For instance, the glyoxalase I gene 
targeted by miR160 has 75% identity with a member of 
Arabidopsis calmodulin-binding proteins, Atlg08110, 
which is involved in plant responses to phytoprostane 
treatment [74,75]. The two KH domain-containing pro- 
teins regulated by miR408 show at least 75% identity with 
At5g56140 that is involved in light responses in Arabidop- 
sis plants growing at low temperature [76]. The dead box 
ATP-dependent RNA helicase gene targted by miR396 is a 
member of a large gene family involved in defense 
responses against pathogen infection and various abiotic 
stresses [77-80]. The heat shock protein encoded by con- 
tig05310, a predicted target of miR396, shows 91% identity 
to Arabidopsis HSP90.1 and HSP81 that are involved in 
various biotic and abiotic stresses [81,82]. These results 
suggested the roles of miRNAs in the development and 
defense responses of D. purpurea plants. Further experi- 
mental validation of the predicted miRNA targets may 



help add new insights into the regulatory mechanisms of 
miRNAs in D. purpurea. 

Indentification of mlncRNA families based on sequence 
homology 

Among the 2660 D. purpurea mlncRNA candidates with 
sizes at least 300 bp, 12 produce miRNAs (Table 2), while 
functions of the other 2648 mlncRNAs are poorly under- 
stood. As a first step toward elucidating the roles of these 
mlncRNAs, we searched sequence homology among them 
using BLASTN [58]. By applying an e-value cutoff of 10" 5 
to the homolog recognition, we revealed that 320 of the 
2648 mlncRNAs could be grouped into 140 families (see 
Additional file 4). Number of members for these 
mlncRNA families is between 2 and 8. The existence of 
mlncRNA families with multiple members could be a first 
indication that they actually are important in D. purpurea. 
No homologs were found for the other 2328 mlncRNAs, 
indicating that most of the mlncRNAs could be single 
copy. These results provide a basis for further demonstrat- 
ing mlncRNA functions. 

Conservation of D. purpurea mlncRNAs 

The conservation of mlncRNAs was analyzed by search- 
ing D. purpurea mlncRNA candidates against the 
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NONCODE database of known npcRNAs using the 
BLASTN program [58,83]. An e-value cutoff of 10" 5 was 
applied. Among the 2648 non-microRNA-producing 
mlncRNA candidates, only eight were found to be con- 
served (see Additional file 5). It suggests the majority of 
D. purpurea mlncRNAs are species-specific. The low 
degree of evolutionary constraint of mlncRNAs was also 
found in other plant species including M. truncatula, 
Arabidopsis, wheat, and animals such as Drosophila and 
mouse, indicating low conservation of mlncRNAs is a 
common phenomenon in organisms [42,45,46,64,84]. 
Among the eight conserved mlncRNAs, two are homo- 
logues of GUT 15, an npcRNA probably involved in hor- 
mone response in Nicotiana tabacum and Arabidopsis 
[85,86]. Functions of the other six conserved mlncRNAs 
are currently unknown. 

Tissue-specific expression of mlncRNAs in D. purpurea 

Except miR408 -producing npcR9 (JO460086), expression 
patterns of the other 29 mlncRNAs with sizes over 700 
bp were analyzed in one-year-old, greenhouse grown D. 
purpurea plants by quantitative real-time RT-PCR 
(qPCR). Ubiquitin gene was selected as a reference since 



it showed stable expression in the D. purpurea tissues 
analyzed compared with actin and 18S rRNA (see Addi- 
tional file 6) [87]. Among the 29 mlncRNAs analyzed, 
two, including mlncR8 and mlncRll, were undetected, 
indicating low levels in the tissues analyzed. The other 
27 were found to be expressed in at least one tissue 
(Figure 7). mlncRl, mlncR4, mlncRlO, mlncR23, 
mlncR29 and mlncR30 are abundant in leaves, while 
mlncR17 and mlncR24 are expressed mainly in roots. 
The levels of mlncR2, mlncR3, mlncR6, mlncR14 and 
mlncR26 are higher in leaves and roots than stems and 
flowers, while mlncR15, mlncR18 and mlncR28 exhibit 
high expression in leaves and stems compared with that 
in flowers and roots. Differential expression was also 
found for the other mlncRNAs (Figure 7). It suggests 
the level of mlncRNAs is developmentally regulated and 
indicates the importance of mlncRNAs in D. purpurea 
growth and development. 

Identification of 24 cold- and 27 dehydration-responsive 
mlncRNAs 

D. purpurea plants tolerate various environmental stres- 
ses [1,8,9]. To address whether mlncRNAs are involved 
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Figure 7 Expression of mlncRNAs in leaves (L) 7 stems (S), flowers (F) and roots (R) of one-year-old D. 

mlncRNA levels were shown. Expression levels were quantified by qPCR. Ubiquitin was used as a reference gei 
were arbitrarily set to 1. Error bars represent the standard deviations of three PCR replicates. 
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in plant response to cold and dehydration stresses, we 
analyzed the levels of 29 non-miRNA-producing 
mlncRNAs with sizes over 700 bp in D. purpurea plant- 
lets treated with cold and dehydration for 1, 5, 10, 24 
hours and compared them with the levels in untreated 
plantlets. We were able to detect 27 mlncRNAs in 
plantlets, except mlncR7, an mlncRNA detected in 
leaves, stems and flowers of one-year old mature plants, 
and mlncRll, which was also undetected in the ana- 
lyzed tissues of mature plants (Figure 7, 8 and 9). D. 
purpurea mlncR7 exhibited opposite expression patterns 
with mlncR8, which expressed in plantlets but not in 
the analyzed tissues of one-year-old plants. It suggests 
that the expression of both mlncR7 and mlncR8 is 
developmentally regulated, while their biological func- 
tions are distinct. D. purpurea mlncRll appears to play 
a temporary role in D. purpurea because it is undetected 
in all of the tissues analyzed. 

Among the 27 mlncRNAs expressed in plantlets, 24 
and 27 showed more than 2-fold changes between at 
least two time-points in plantlets treated with cold and 
dehydration, respectively, and thus was considered as 
cold- or dehydration-responsive mlncRNAs (Figure 8 
and 9). It includes 3 responsive to dehydration only and 



24 responsive to cold and dehydration, suggesting all of 
the mlncRNAs analyzed are stress-responsive and the 
majority is not only cold-responsive but also dehydra- 
tion-responsive. It demonstrates the importance of 
mlncRNAs in stress responses and indicates the exis- 
tence of a crosstalk among mlncRNAs in response to 
cold and dehydration stresses. The results of almost all 
analyzed mlncRNAs to be cold- and/or dehydration- 
responsive indicate that the plant materials were prob- 
ably exposed to environmental stresses before being 
used for EST library construction. 

Based on the expression patterns in response to cold 
stress, mlncRNAs can be roughly categorized into three 
major types (Figure 8). Type A mlncRNAs, including 
mlncRl, mlncR2 and mlncR15, showed less than 2-fold 
changes between any two time-points. These mlncRNAs 
appears to be not responsive to cold. Type B, such as 
mlncR17 and mlncR21, exhibited an immediate decrease 
after treatment for 1 hour. The decrease continued for 
at least 10 hours and then some of them, such as 
mlncR21, showed a trend of recovering to the level in 
untreated tissues. By contrast, type C mlncRNAs, 
accounting for more than 80% or 22 of the total 27 
mlncRNAs, showed a rapid increase after treatment. 
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Figure 8 Expression of mlncRNAs in D. purpurea plantlets treated with cold stress for 0, 1, 5, 10 and 24 hours. Fold changes of 
mlncRNA levels were shown. Expression levels were quantified by qPCR. Ubiquitin was used as a reference gene. The mlncRNA levels in leaves 
were arbitrarily set to 1. Error bars represent the standard deviations of three PCR replicates. 
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Figure 9 Expression of mlncRNAs in D. purpurea plantlets treated with dehydration stress for 0, 1, 5, 10 and 24 hours. Fold changes of 
mlncRNA levels were shown. Expression levels were quantified by qPCR. Ubiquitin was used as a reference gene. The mlncRNA levels in leaves 
were arbitrarily set to 1. Error bars represent the standard deviations of three PCR replicates. 



The highest level was reached, for most type C mlncRNAs, 
at 1 h after stress, or in a few cases, at 5 (mlncR30) or 10 
hours (mlncRIO). After reaching to a maximum, the 
mlncRNA levels quickly declined to near, or in most cases, 
far below the levels in untreated tissues. Rapid changes of 
expression level indicate the importance of type B and 
type C mlncRNAs in cold-stress response, although the 
underlying biological functions can be distinct between 
type B and type C mlncRNAs. 

Similarly, based on the expression patterns in response 
to dehydration stress, mlncRNAs can also be roughly clas- 
sified into 3 groups (Figure 9). The level of group I 
mlncRNAs, such as mlncR22, kept constant within the 
first 10 hours of dehydration stress and then increased at 
24 hours, showing a relatively slow response. Group II and 
group III mlncRNAs showed much quicker responses to 
dehydration stress. The expression pattern of group II 
mlncRNAs in response to dehydration is similar to type B 
mlncRNAs in response to cold stress, exhibiting a rapid 
decrease at 1 hour of stress (Figure 8 and 9). Except 
mlncR15, which showed an increase at 10 hours, down- 
regulation of most group II mlncRNAs lasted for at least 
24 hours. Group II, consisting of 20 mlncRNAs, is much 
larger than type B which includes only two, although 



group II and type B mlncRNAs exhibited similar expres- 
sion patterns in response to dehydration and cold stress, 
respectively. Group III consists of 6 mlncRNAs, including 
mlncR6, mlncR8, mlncR23, mlncR24, mlncR28 and 
mlncR30. Expression of mlncRNAs in this group was 
induced at 1 hour of stress and then immediately down- 
regulated to far below the level in untreated tissues. It is 
similar to the pattern of type C mlncRNAs in response to 
cold stress (Figure 8 and 9). The similarity of mlncRNA 
expression patterns in response to cold and dehydration 
suggests that the cold signaling network and the dehydra- 
tion signaling network are probably overlapped. 

Identification of 417 protein-coding genes with regions 
significantly homologous or complementary to 375 
mlncRNAs 

Genes with conserved sequences usually show structural, 
functional or evolutionary relationships. In order to eluci- 
date the relationship between mlncRNAs and protein- 
coding genes in D. purpurea, we searched our unigene 
set for protein-coding genes with regions significantly 
homologous or complementary to mlncRNAs using 
BLASTN and applying a generous e-value cutoff of 10~ 5 
to the BLAST homolog recognition [58]. A total of 417 
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protein-coding genes were identified (see Additional 
file 7). They showed sequence similarity to 375 
mlncRNAs in coding sequences (CDS), untranslated 
regions (UTR) or CDS-UTR junctions. Among the 417 
protein-coding genes, 373 each had only one correspond- 
ing mlncRNAs, while the others each had 2-9 hits, imply- 
ing protein-coding genes can be related to one or several 
mlncRNAs. Similarly, among the 375 mlncRNAs, 51 each 
showed sequence similarity to 2-10 protein-coding genes, 
indicating some mlncRNAs are related to multiple pro- 
tein-coding genes. The results demonstrate the complex 
relationship between mlncRNAs and protein-coding 
genes in D. purpurea. Among the 417 protein-coding 
genes, many encode known proteins involved in metabo- 
lism, signal transduction, gene regulation, stress 
response, protein folding and degradation, or nucleic acid 
and chromatin modification, indicating the importance of 
mlncRNAs in the growth and development of D. pur- 
purea plants (Table 4 see Additional file 7). 

Expression analysis of 4 protein-coding genes 
homologous or complementary to mlncRNAs 

Four mlncRNA/protein-coding gene pairs were selected 
for further analysis (Figure 10). It includes mlncRl 
(JO460015)/vacuolar H + -ATP synthase subunit E gene 
{VHA-E, FXAT9O005FSUW), mlncR6 (JO460006)/SNFl- 
related protein kinase gene (SnRK, J0461197), mlncR8 
(JO463019)/4-hydroxy-3-methylbut-2-en-l-yl diphosphate 
synthase gene (HDS, J0463639), and mlncR31 (FXA- 
T9O005F8J63)/solanesyl diphosphate synthase gene {SPS, 
JO463507). We first determined the transcriptional direc- 
tion of protein-coding genes based on the direction of 
annotated proteins, and then experimentally identified the 
transcriptional direction of mlncRNAs using 3' RACE. 
Results show that mlncRl and mlncR6 and mlncR8 have 
the same transcriptional direction as their corresponding 
protein-coding genes in the conserved regions, suggesting 
these mlncRNAs are homologous to the corresponding 
protein-coding genes. Among them, mlncRl and mlncR8 
share homologous sequences with the 5' UTR of VHA-E 
and the 3' UTR of HDS, respectively, while mlncR6 has a 
region highly similar to the CDS of SnRK. In contrast, 



mlncR31 appears to be complementary to the 5' UTR of 
SPS gene (Figure 10). 

We next analyzed the expression patterns of VHA-E, 
SnRK, HDS and SPS in response to cold and dehydration 
stresses and compared them with the corresponding 
mlncRNAs (Figure 11). In general, the expression of VHA- 
E, SnRK and HDS showed a positive correlation with the 
homologous mlncRNAs, mlncRl, mlncR6 and mlncR8, 
respectively, although variation was found at very early 
stage after treatment, such as 1 hour. The most conspicu- 
ous positive correlation of expression was found at 5, 10 
and 24 hours of stress. It suggests the existence of a close 
relationship between protein-coding genes and the homo- 
logous mlncRNAs and indicates that some protein-coding 
genes can be regulated by mlncRNAs based on the sense: 
sense interaction through a currently unknown mechan- 
ism or the protein-coding genes and mlncRNAs may be 
regulated by the same cis/trans factors. In contrast, the 
expression of SPS gene showed a negative correlation with 
mlncR31 after 5 hours of stress, being most noticeable in 
plantlets treated with cold stress (Figure 11). It indicates 
the presence of sense:antisense interaction between pro- 
tein-coding genes and the complementary mlncRNAs. 

Discussion 

Some mlncRNAs shown sense or antisense homology 
with protein-coding genes involved in secondary 
metabolism in D. purpurea 

Transcriptomic analysis is an effective approach to dis- 
cover genes particularly for those organisms without 
whole genome information, such as D. purpurea, an eco- 
nomically and ecologically important plant species. 
Through high-throughput 454 sequencing and subsequent 
assembly, we obtained a high-quality D. purpurea unigene 
set comprising a total of 23532 genes, which include 140 
putatively involved in the biosynthesis of terpenoids/car- 
diac glycosides, the most effective drugs to treat heart fail- 
ure during the past two centuries [1,4,5]. The 140 genes 
were grouped into 30 families, representing about 75% of 
total cardiac glycoside biosynthesis-associated gene 
families in D. purpurea. These sequence data greatly 
enhance our knowledge of cardiac glycoside biosynthesis 



Table 4 mlncRNAs shown sequence homology to secondary metabolism-related protein-coding genes 



mlncRNA ID 


Protein-coding 
gene ID 


Enzyme name 


Pathway involved 


Contig03310 
(mlncR8) 


J0463639 


4-Hydroxy-3-methylbut-2-en-1 -yl diphosphate 
synthase 


Terpenoid backbone biosynthesis 


FXAT9O005F8J63 
(mlncR31) 


JO463507 


Solanesyl diphosphate synthase 


Biosynthesis of isoprenoid side chain of ubiquinone and 
plastoquinone 


FXAT9O005FZSI2 


FXAT9O005FZ5UJ 


Dihydroflavonal-4-reductase 


Flavonoid biosynthesis 


FXAT9O005GD1W7 


FXAT9O005GCP3D 


Phytoene dehydrogenase 


Carotenoid biosynthesis 


FXAT9O005FYD4C 


FXAT9O005FK4AG 


Aromatic amino acid decarboxylase 


Alkaloid biosynthesis 
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Figure 10 Protein-coding genes homologous or complementary to mlncRNAs. Protein-coding genes are presented by heavy black lines 
(coding sequences) and thin lines (nontranslated regions). mlncRNAs are represented by thin lines. The transcriptional direction of protein- 
coding genes was determined based on the direction of annotated proteins, while that of mlncRNAs was analyzed by 3' RACE. VHA-E, SnRK and 
HDS are homologous to mlncRI, mlncR6 and mlncR8, respectively (A, B and C). SPS shares an antisense homology with mlncR31 (D). 
Homologous or complementary regions are shaded in grey and the nucleotide positions are shown. Sequence identities and Watson-Crick 
pairing (vertical dashes) between protein-coding genes and the corresponding mlncRNAs are indicated. 
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Figure 11 Expression of mlncRI, mlncR6, mlncR8 and mlncR31 
and the corresponding protein-coding genes, VHA-E, SnRK, HDS 
and SPS, in D. purpurea plantlets treated with cold (left panel) 
and dehydration (right panel) stresses for 0, 1, 5, 10 and 24 
hours. Fold changes of expression levels were shown. The levels of 
mlncRNAs and protein-coding genes were quantified by qPCR. 
Ubiquitin was used as a reference gene. The levels of mlncRNAs and 
protein-coding genes in untreated tissues were arbitrarily set to 1. 
Error bars represent the standard deviations of three PCR replicates. 



and provides useful information for manipulating cardiac 
glycoside biosynthesis in D. purpurea through biotechnol- 
ogy. More importantly, we found that 5 mlncRNAs 
showed sense or antisense homology with 5 protein- 



coding genes involved in secondary metabolism, 
including one putatively involved in terpenoid bio- 
synthesis (Table 4). 

HDS is the penultimate enzyme of the seven-step DXP 
pathway, one of the two pathways generating the C5 
backbone of all terpenoids including the hormone GA 
and CGs (see Additional file 8). It catalyzes the conver- 
sion of 2-C-methyl-D-erythritol 2,4-cyclodiphosphate 
into l-hydroxy-2-methyl-2-butenyl 4-diphosphate and is 
encoded by a single gene in various plants, including 
Arabidopsis, Hevea brasiliensis and Ginkgo biloba 
[88-91]. Through sequence similarity search and subse- 
quent transcriptional direction analysis, we found that 
the 3' UTR of D. purpurea HDS gene contained a 90 
bp-region with 87% identities to an mlncRNA, mlncR8 
(Figure 10). The expression of HDS and mlncR8 was 
positively correlated at 5, 10 and 24 hours of cold and 
dehydration stresses (Figure 11). It indicates the exis- 
tence of close relationship between HDS and mlncR8 
through a currently unknown mechanism. 

SPS gene involved in the biosynthesis of ubiquinone 
and plastoquinone is related to mlncR31. It utilizes far- 
nesyl diphosphate (FPP) and geranylgeranyl diphosphate 
(GGPP) as substrates and is responsible for the bio- 
synthesis of isoprenoid side chain of ubiquinone and 
plastoquinone in Arabidopsis [92]. The 5' UTR of 
D. purpurea SPS gene harbors a 102 bp-region highly 
complementary to mlncR31 (Figure 10). The expression 
of SPS gene is negatively correlated with mlncR31, 
which is most noticeable in plantlets treated with cold 
stress for 5, 10 and 24 hours (Figure 11), indicating SPS 
gene is regulated by mlncR31 through RNA-mediated 
gene silencing in D. purpurea [93-95]. Both ubiquinone 
and plastoquinone are electron carriers. Plastoquinone is 
a member of the photosynthetic electron transport 
chain in chloroplast, while ubiquinone is a component 
of respiratory electron transport chain in mitochondria. 
Thus, mlncR31 may be important in maintaining the 
activities of plant cells under environmental stress 
conditions. 

In addition, mlncRNAs are probably involved in the bio- 
synthesis of flavonoid, carotenoid and alkaloid (Table 4). 
The mlncRNA, FXAT9O005FZSI2, share homology with a 
unigene (FXAT9O005FZ5UJ) encoding dihydroflavonal-4- 
reductase (DFR), which catalyze the reduction of dihydro- 
flavonol into flavan-3,4-diol (leucoanthocyanin), a key step 
in the biosynthesis of anthocyanins, proanthocyanidins and 
other flavonoids important for plant survival and human 
nutrition [96]. FXAT9O005GD1W7 appears the other 
mlncRNA involved in secondary metabolism. It shows 
sequence similarity to a unigene (FXAT9O005GCP3D) 
encoding phytoene dehydrogenase (PDS). PDS, which cata- 
lyzes the desaturation of phytoene into phytofluene, is a 
rate-limiting enzyme in carotenoid biosynthesis in plants 
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[97]. The mlncRNA probably involved in alkaloid biosynth- 
esis is FXAT9O005FYD4C, which shares homology with 
the coding region of aromatic amino acid decarboxylase 
(AADC) gene (FXAT9O005FK4AG). AADC catalyzes the 
decarboxylation of tyrosine into tyramine, a key step of 
plant alkaloid biosynthesis [98,99]. 

Taken together, these results strongly suggest the 
importance of mlncRNAs in secondary metabolism in 
D. purpurea, 

D. purpurea mlncRNAs exhibit species-specific 
characteristics and are important in plant development 
and stress responses 

Using a computational mlncRNA identification pipeline, 
we identified 2660 D. purpurea mlncRNA candidates, of 
which only about 12% can be grouped into families with 
at least two members in a family, while the others are 
single gene family members. Furthermore, the conserva- 
tion of mlncRNAs is very low. Searching the NONCODE 
database of known npcRNAs, we found that only about 
0.3% of the mlncRNAs analyzed was conserved, suggest- 
ing the vast majority of D. purpurea mlncRNAs are spe- 
cies-specific. These results are consistent with those 
obtained from other plant species including M. trunca- 
tula, Arabidopsis, wheat, and animals such as Drosophila 
and mouse [42,45,46,64,84], indicating mlncRNAs may 
be evolved with a low degree of constraint and many of 
them are probably undergoing frequent birth and death. 
It is particularly true for the mlncRNAs which produce 
non-conserved microRNA families represented by single 
genes [100]. 

In this study, we identified a total of 13 microRNA-pro- 
ducing mlncRNAs based on the conservation of miRNA 
sequences and the secondary structure of unigenes 
[66-68]. The mature microRNAs were classified into 
eight families, including MIR156, MIR160, MIR166, 
MIR167, MIR172, MIR396, MIR397 and MIR408 (Figure 
6, Table 2). They were predicted to regulate at least 25 
protein-coding genes, of which 4 encode conserved tran- 
scriptional factors, including two ARFs, an AP2 and a 
GRF (Table 3). ARFs play regulatory roles in the develop- 
ment of plant organs, such as seed, gynoecium, embryo, 
hypocotyl, root and petal in Arabidopsis [101-107]. AP2 
is involved in floral transition and floral development 
[108]. GRF control cell proliferation in Arabidopsis leaves 
[109,110]. It suggests the importance of microRNAs in 
the development of D. purpurea plants. Additionally, 
among the 417 protein-coding genes shown sense or 
antisense homology with mlncRNAs, several, such as 
J0464993, FXAT9O005FR21D, FXAT9O005GE5AF, 
FXAT9O005F8DZ7, FXAT9O005F4WF9 and FXA- 
T9O005GCSS5, encode transcription factors and other 
development-related proteins, indicating some non- 
microRNA-producing mlncRNAs are also probably 



involved in the development of D. purpurea plants (see 
Additional file 7). It is consistent with the findings show- 
ing the involvement of mlncRNA (Zm401) in pollen 
development in Zea mays and the regulatory role of 
mlncRNA (enod40) in symbiotic nitrogen-fixing nodule 
formation in legumes [49,111,112]. 

Expressional analysis of mlncRNAs allowed us to iden- 
tify 24 cold- and 27 dehydration-responsive mlncRNAs, 
suggesting the importance of mlncRNAs in stress 
responses in D. purpurea (Figure 8 and 9). In addition, the 
majority of mlncRNAs analyzed were responsive to both 
cold and dehydration and many of them exhibited similar 
expression patterns under cold and dehydration condi- 
tions, indicating the cold signaling network and the dehy- 
dration signaling network are probably overlapped. Stress 
responses of mlncRNAs have been reported previously in 
various plants, including Arabidopsis [39,44,113] and 
wheat [46]. For example, Arabidopsis IPS1 and M. trunca- 
tula Mt4 mlncRNAs are involved in phosphate starvation 
through fine-tuning the expression of miR399/PH02 gene 
[54,56,113]. Arabidopsis COOLAIR (cold induced long 
antisense intragenic RNA) and COLDAIR (COLD 
ASSISTED INTRONIC NONCODING RNA) play signifi- 
cant roles in cold response and are required for the verna- 
lization-mediated epigenetic repression of FLC [44,55]. 
Several wheat mlncRNAs are responsive to powdery mil- 
dew infection and/or heat stress [46]. Thus, mlncRNAs 
seem to be very important for the whole plant kingdom to 
survive in the stressful environments. 

Functional mechanisms of mlncRNAs in D. purpurea 

mlncRNAs have been found to interact with RNA, DNA 
and protein molecules [13,114,115]. A small portion of the 
identified D. purpurea mlncRNAs produce conserved 
microRNAs. Since no small RNA sequences are available 
and experimental setup does not allow us to detect the 
processed miRNAs, we are not able to identify those 
producing non-conserved microRNAs from the set of 
D. purpurea mlncRNA candidates. However, it is rational 
to believe the existence of non-conserved microRNA-pro- 
ducing D. purpurea mlncRNAs based on the increasing 
number of non-conserved microRNAs in the miRBase 
[67]. These mlncRNAs play crucial roles in organ develop- 
ment and defense responses through producing conserved 
or non-conserved microRNAs, which target other RNA 
molecules for cleavage, or, in a few cases, for translational 
repression [35-38]. Additionally, some mlncRNAs play 
regulatory roles in plants through generating siRNAs. 
These mlncRNAs can be antisense transcripts of protein- 
coding genes, ta-siRNA precursors, repeat- associated tran- 
scripts, or transcripts sharing antisense homology with 
protein-coding genes [116-118]. It is also possible that 
some mlncRNAs may function through inhibiting the 
action of microRNAs. For example, the Arabidopsis 



Wu et al. BMC Genomics 2012, 13:15 
http://www.biomedcentral.eom/1 471 -21 64/1 3/1 5 



Page 1 7 of 23 



mlncRNA, IPS1, contains a 24 nt region complementary 
to miR399, a microRNA involved in phosphate starvation 
by targeting PH02 gene for cleavage. IPS1 mimics PH02, 
except that the base-pairing is interrupted by a mis- 
matched loop at the miR399 cleavage. Therefore, IPS1 is 
not cleaved but instead sequesters miR399, resulting in 
inhibition of miR399 activity on PH02 [113]. The siRNA- 
generating mlncRNAs and the mimics of microRNA tar- 
gets in our mlncRNA set remain to be identified. 

In this study, we predicted a total of 2660 D. purpurea 
mlncRNA candidates using a computational mlncRNA 
identification pipeline. According to the sequence similar- 
ity between mlncRNAs and protein-coding genes and the 
location of homologous regions in mlncRNAs and pro- 
tein-coding genes, the majority of mlncRNAs candidates 
appear to be not the fragments of UTRs of protein-coding 
genes (Figure 11; see Additional file 7). Among the 2660 
mlncRNA candidates, 375 mlncRNAs share sense or anti- 
sense homology with 417 protein-coding genes, indicating 
that some mlncRNAs and the corresponding protein-cod- 
ing gene are probably homologous genes derived from a 
common ancestral gene, or these mlncRNAs are new born 
genes originated from the corresponding protein-coding 
genes, as the case of long non- coding RNA, Xist, and the 
protein-coding gene, Lnx3, in mammals [119,120]. The 
mlncRNAs sharing antisense homology with protein-cod- 
ing genes, such as D. purpurea mlncR31, seems to play a 
role in post-transcriptional regulation of the correspond- 
ing protein-coding genes. It is evidenced by the negative 
correlation of expression between mlncR31 and the SPS 
gene after 5 hours of stress (Figure 11). The roles in post- 
transcriptional regulation have been previously reported 
for some long non-coding RNAs identified from various 
animal species, such as human and mouse [13,93,95]. It is 
not well-known for the mechanisms of mlncRNAs acting 
on protein-coding genes with sense homology, such as 
mlncRl, mlncR6 and mlncR8 and their homologous pro- 
tein-coding genes, VHA-E, SnRK and HDS, respectively. 
One of the possible mechanisms is co-suppression, a phe- 
nomenon in plants in which a sense transgene sometimes 
cause suppression of both the endogenous gene and the 
transgene [121-123]. Further identification of the interac- 
tion mechanism between mlncRNAs and protein-coding 
genes will give us a much clearer picture of how 
mlncRNAs function in D. purpurea. 

Conclusions 

The comprehensive analysis of transcriptome data from 
high-throughput sequencing allow us to discover many 
novel unigenes involved in the biosynthesis of secondary 
metabolites, such as cardiac glycosides, ubiquinone, plasto- 
quinone, flavonoid, carotenoid and alkaloid, and to identify 
a large number of mlncRNA candidates in D. purpurea. 
Detailed analysis suggests that the majority of D. purpurea 



mlncRNAs are species-specific and most mlncRNA 
families are single gene families. These mlncRNAs exhib- 
ited tissue-specific expression and responded to cold and 
dehydration stresses. Since at least 24 mlncRNAs were 
found to be not only cold-responsive but also dehydra- 
tion-responsive, a crosstalk could exist among mlncRNAs 
in response to cold and dehydration stresses. The identi- 
fied mlncRNAs appears to be involved in many aspects, 
such as plant development, stress response and secondary 
metabolism. The regulatory role is associated with pro- 
tein-coding genes sharing sequence similarity with the 
mlncRNAs. These results provide novel and significant 
information for understanding the biosynthesis of second- 
ary metabolites and the function of mlncRNAs. In addi- 
tion, a large number of RNAseq data of D. purpurea have 
been recently released in the NCBI SRA database under 
the series identifier SRP006029, which provides an addi- 
tional source for the discovery of genes and mlncRNAs in 
D. purpurea. 

Methods 

Plant materials 

Leaves, stems, flowers and roots were collected from one- 
year-old, greenhouse grown D. purpurea 'GIANT SHIRLY' 
plants and then used for EST library construction and tis- 
sue-specific expression analysis of mlncRNAs. For cold 
and dehydration treatments, D. purpurea seeds were steri- 
lized in 70% ethanol for 1 min and then in 0.1% mercuric 
chloride for 8 min. After washing three times in sterile 
water, the seeds were grown on MS medium supplemen- 
ted with 3% sucrose and 0.9% agar for 3 weeks at 25°C 
under 14 hours light: 10 hours dark. The plantlets were 
treated at 4°C, or removed from the agar and then dehy- 
drated in glass dishes at 25°C under dim light, for 0, 1, 5, 
10, 24 hours. Six seedlings were collected in liquid nitro- 
gen for each example. 

EST library construction and 454 sequencing 

Total RNA was extracted from D. purpurea leaves using 
the Universal Plant Total RNA Extraction kit (Bioteke). 
mRNA was isolated from total RNA using the Oligotex 
mRNA kit (Qiagen). Single-stranded cDNA was prepared 
from mRNA using the SMART cDNA Synthesis kit (Clon- 
tech). Double-stranded cDNA was amplified by PCR poly- 
merase Advantage II (Clontech). PCR products less than 
300 bp in length were removed using the PureLink™ PCR 
Purification kit (Invitrogen). Pyrosequencing was per- 
formed by 454 Life Sciences using the Genome Sequencer 
FLX System. 

EST assembly, annotation and classification 

The GS-FLX Software (454 Life Sciences, Roche) was 
used for trimming adapters and filtering low-quantity 
sequences. High-quantity reads were assembled into 
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unigenes by the GS De Novo Assembler Software, an 
application of the GS FLX Software. The unigenes 
obtained were annotated by sequence similarity searches 
against the NCBI non-redundant (Nr) protein database 
using the Basic Local Alignment Search Tool (BLAST) 
[58]. A generous e-value cutoff of 1CT 5 was applied to 
the BLAST homolog recognition. Gene Ontology (GO) 
terms were assigned using InterProScan [59]. 

mlncRNA identification 

The procedure for computational mlncRNA identification 
is outlined in Figure 4. Unigenes without Nr annotation 
and with sizes at least 300 bp were selected for open-read- 
ing frame (ORF) prediction using ESTScan 2 [124]. Com- 
putational prediction was performed on a local server 
through batch analysis using the default parameters. A 
cutoff of 100 amino acids was applied to distinguish 
mlncRNAs from protein-coding transcripts as described 
[43]. Housekeeping npcRNAs were searched against Rfam 
10.0 (January 2010) using BLASTN [58,125]. Sequence 
homology among the D. purpurea mlncRNA candidates 
was analyzed using BLASTN [58]. The conservation of 
mlncRNAs was analyzed by searching D. purpurea 
mlncRNA candidates against the NONCODE database of 
known npcRNAs using the BLASTN program [58,83]. An 
e-value cutoff of 10~ 5 was applied to the homolog 
recognition. 

Analysis of conserved miRNAs 

Plant microRNA sequences were downloaded from miR- 
Base (Released 17, http://www.mirbase.org/) [67]. Uni- 
genes with regions no more than three mismatches to 
known microRNAs were searched against the D. purpurea 
unigene set using BLASTN [58]. A word size of 4 was 
used as described [126]. The secondary structures were 
predicted using mfold [68] . Criteria described by Meyers 
et al (2008) were applied to annotate D. purpurea micro- 
RNAs. The minimal folding free energy index (MFEI) was 
calculated as described previously [127]. 

Prediction of microRNA targets 

Prediction of microRNA targets was performed on the 
psRNATarget Server using the default parameters [70]. 

qPCR 

Total RNA was extracted from plant tissues using the 
Concert™ Plant RNA Reagent (Invitrogen) and then 
digested with RNase-free DNase I (Promega) to remove 
the genomic DNA contamination. Reverse transcription 
was performed on 2 [ig total RNA for each example by 
200 U Superscript III Reverse Transcriptase (Invitrogen) 
in a 20 \i\ volume. The reaction was carried out at 65°C 
for 5 minutes, 50°C for 60 minutes and 70°C for 15 min- 
utes. The resulting cDNA was diluted to 1,600 \i\ with 



sterile water. qPCR was carried out in triplicate reactions 
using the BIO-RAD CFX system (BIO-RAD). Gene-speci- 
fic primers were listed in (see Additional file 9). Ubiquitin 
gene was selected as a reference. PCR was carried out in a 
20 (il volume containing 2 \A diluted cDNA, 250 nM for- 
ward primer, 250 nM reverse primer, and 1 x SYBR Pre- 
mix Ex Taq II (TaKaRa) using the following conditions: 
95°C for 3 min, 40 cycles of 95°C for 15 sec, 60°C for 
15 sec and 72°C for 15 sec. Specificity of the amplification 
was verified by the Bio-Rad CFX Manage software based 
on dissociation curves. The results from gene-specific 
amplification were analyzed using the comparative Cq 
method which uses an arithmetic formula, 2" AACq , to 
achieve results for relative quantification [128]. Cq repre- 
sents the threshold cycle. 

3' RACE for determining the transcriptional direction of 
mlncRNAs 

3' RACE was performed on total RNA isolated from 
D. purpurea plantlets treated with cold stress for 1 hour 
using the GeneRacer kit (Invitrogen). PCR amplifications 
were carried out using the GeneRacer 3' primer and the 
nesting gene-specific primers. Nested PCR amplifications 
were performed using the GeneRacer 3' nested primer 
and the nested gene-specific primers (see Additional file 
10). PCR products were purified, cloned and sequenced. 

Additional material 



Additional file 1: Unigenes probably involved in cardiac glycoside 
biosynthesis. Complete set of unigenes probably involved in cardiac 
glycoside biosynthesis. 

Additional file 2: The 25 most abundant unigenes in D. purpurea. 

Complete set of the 25 most abundant unigenes in D. purpurea. 

Additional file 3: Housekeeping npcRNAs identified. Complete set of 
the housekeeping npcRNAs identified. 

Additional file 4: D. purpurea mlncRNA families. Complete set of the 
D. purpurea mlncRNA families. 

Additional file 5: Conserved mlncRNAs. Complete set of the 
conserved mlncRNAs. 

Additional file 6: Expression of housekeeping genes. (A) Cq value of 
ubiquitin, actin and 18s rRNA in leaves (L), stems (S), flowers (F) and 
roots (R) of one-year-old D. purpurea plants and plantlets treated with 
cold (CO, C1, C5, C10 and C24) and dehydration (DO, D1, D5, D10 and 
D24) stresses for 0, 1, 5, 10 and 24 hours. Expression levels were 
quantified by qPCR. The mean Cq value of three PCR replicates is shown. 
(B) The expression stability value (M) was calculated by geNorm software 
[87]. 

Additional file 7: Protein-coding genes significantly homologous or 
complementary to mlncRNAs. Complete set of the protein-coding 
genes significantly homologous or complementary to mlncRNAs. 

Additional file 8: The putative biosynthetic pathway of cardiac 
glycosides in Digitalis purpurea. The pathway roughly comprises 
terpenoid backbone biosynthesis, steroid biosynthesis and cardenolide 
biosynthesis three stages. The enzymes with corresponding unigenes are 
framed. 

Additional file 9: Primers used in qPCR. Complete set of the primers 
used in qPCR. 
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Additional file 10: Primers used for indentifying the transcriptional 
direction of mlncRNAs. Complete set of the primers used for 
indentifying the transcriptional direction of mlncRNAs. 
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amplification of cDNA ends; RT-PCR: reverse transcription-RCR; siRNAs: 
small interfering RNAs; snoRNAs: Small nucleolar RNAs; SnRK: SNF1 -related 
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diphosphate synthase; sRNAs: small RNAs; ta-siRNA: trans small interfering 
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